ABSTRACT


This report documents research activities which were conducted under
the auspices of National Aeronautics and Space Administration Cooperative
Agreement NCC 9-9. During this contract period research efforts were
concentrated in two primary areas. The first area was an investigation of
the use of measurement error models as alternatives to least squares
regression estimators of crop production or timber biomass. The second
primary area of investigation was the estimation of the mixing proportion
of two-component mixture models. This report documents publications,
technical reports, submitted manuscripts, and oral presentations which
resulted from these research efforts. Possible areas of fruitful future
research are mentioned.
I. MAJOR RESEARCH ACTIVITIES


Research supported by this cooperative agreement primarily focused on
two major topics: regression estimation using models in which both the
response and the predictor variables are subject to measurement errors, and
the estimation of parameters in mixture models. The first topic was
investigated as a means of improving satellite remote-sensing estimates of
crop production and timber biomass by combining satellite estimates with
ground-truth measurements. The latter topic was explored in an effort to
improve the estimation of parameters in models of crop or vegetation
proportions when there are two or more different types of vegetation in a
segment.

Several research publications resulted from the research on these and
other topics. Six publications appeared in refereed scientific journals
and another three manuscripts were published as technical reports or
proceedings papers. In addition, four manuscripts are currently submitted
for publication, the last of which is included in this report as Appendix
C. A complete list of publications and submitted manuscripts is given in
Appendix A. Ten oral presentations, listed in Appendix B, were given to
further disseminate the results of this research.

A summary of the major topics of research which were undertaken
through the support of this cooperative agreement is presented in the next
three sections.


A. MEASUREMENT ERROR MODELS 


The primary focus of the research conducted under the auspices of this
cooperative agreement was on the investigation of techniques for improving
the use of satellite remote-sensing estimates of crop proportions and
timber biomass. Specific attention was devoted to the combining of
relatively inexpensive but imprecise satellite estimates with relatively
expensive but highly precise ground-truth estimates. Although least
squares regression estimates had been used for some time in this effort, it
was recognized that both the satellite estimates and the ground-truth
estimates are subject to measurement error, thereby invalidating one of the
key assumptions needed for the use of least squares estimation. Thus
intense research activities were directed toward the investigation of
regression estimation with measurement error models.

Denote a true (i.e., error-free) ground-truth measurement by $Y$ and the
corresponding error-free satellite measurement by $X$. Assume that an
adequate representation of the relationship between the two measurements is
given by a linear model of the form

    $Y = \alpha + \beta X$.

Because of errors of measurement (e.g., registration errors, irregularly
shaped fields, etc.), the true ground-truth and satellite measurements are
not observed. Rather, one observes

    $x = X + u$ and $y = Y + v$,

where $u$ and $v$ represent the measurement errors. In this framework the
least squares estimators of $\alpha$ and $\beta$ are biased, since an
underlying assumption which is necessary for unbiased least squares
estimation is that the observable predictor variable $x$ is measured
without error.
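
The size of this bias can be made explicit with a standard textbook calculation, which is not taken from the report. It assumes, beyond what the text states, that $X$, $u$, and $v$ are mutually uncorrelated with variances $\sigma_X^2$, $\sigma_u^2$, and $\sigma_v^2$. The symbol $b$ (introduced here only for illustration) denotes the least squares slope from regressing $y$ on $x$.

```latex
\begin{align*}
\operatorname{cov}(x, y) &= \operatorname{cov}(X + u,\; \alpha + \beta X + v) = \beta\,\sigma_X^2, \\
\operatorname{var}(x)    &= \sigma_X^2 + \sigma_u^2, \\
b \;\xrightarrow{\;p\;}\; \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)}
                         &= \beta\,\frac{\sigma_X^2}{\sigma_X^2 + \sigma_u^2}.
\end{align*}
```

Under these assumptions the least squares slope is pulled toward zero by the factor $\sigma_X^2/(\sigma_X^2 + \sigma_u^2)$, which is less than 1 whenever the satellite measurement contains error.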
To date, research has concentrated on structural measurement error
models; i.e., the true predictor variable $X$ is stochastic. It is well
known that if $X$, $u$, and $v$ are normally distributed there do not exist
consistent estimators of $\alpha$ and $\beta$ unless (a) one or more of the
model parameters is known, (b) replicated observations are available, or
(c) measurements are taken on additional (instrumental) variables which are
correlated with the true predictor variable $X$ but not with the errors $u$
and $v$.

Assuming independent normal distributions for $X$, $u$, and $v$ with
$\lambda = \mathrm{var}(v)/\mathrm{var}(u)$ known, an investigation was
conducted into the effects of sample size on the precision of the maximum
likelihood estimator of $\beta$ and on the consequences of selecting an
erroneous value for $\lambda$. The results of this research were reported
in publication 1(b) (Appendix A) and technical reports 2(a) and 2(c). A
parallel investigation when the errors $u$ and $v$ are correlated is
reported in publication 1(d). A summary of many of the properties of these
estimators and a comparison with least squares is reported in the
manuscript 3(b).
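
As an illustration of estimation with a known error variance ratio, the following sketch computes the usual closed-form estimator for the linear structural model (often called Deming regression) alongside ordinary least squares. The report does not print its formulas, so the closed form and the function names `structural_mle` and `least_squares` are assumptions made for this example rather than the authors' code.

```python
import math


def structural_mle(x, y, lam):
    """Estimate (alpha, beta) in Y = alpha + beta*X when x = X + u and
    y = Y + v are observed and lam = var(v)/var(u) is treated as known.
    Uses the standard closed form for the slope (assumed, not from the report)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x) / n
    syy = sum((yi - ybar) ** 2 for yi in y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / n
    if sxy == 0.0:
        raise ValueError("slope is undefined when the sample covariance is zero")
    d = syy - lam * sxx
    beta = (d + math.sqrt(d * d + 4.0 * lam * sxy * sxy)) / (2.0 * sxy)
    return ybar - beta * xbar, beta


def least_squares(x, y):
    """Ordinary least squares of y on x, for comparison."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope
```

Calling `structural_mle` with several values of `lam` on the same data gives a quick, informal view of how sensitive the slope is to a misspecified ratio, which is the question studied in publication 1(b).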


B. MIXTURE MODELS 

Mixture models are used to probabilistically characterize the occurrence
of spectral measurements from segments in which two or more crops or other
vegetation are present. If $x$ denotes a (possibly vector-valued) spectral
measurement from a segment in which two crops are present, the probability
density function $f_\theta(x)$ can be expressed as

    $f_\theta(x) = p\,g_1(x;\theta_1) + (1-p)\,g_2(x;\theta_2)$,

where $g_j(x;\theta_j)$ is the density function for $x$ in crop $j$,
$\theta = (p, \theta_1, \theta_2)'$ is the vector of model parameters (with
$\theta_j$ possibly vector-valued), and $p$ is the proportion of crop 1 in
the segment.
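
A minimal sketch of this two-component density is given below, assuming scalar measurements and normal components purely for illustration (the report also considers Weibull components). The helper names `normal_pdf` and `mixture_pdf` are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, p, g1, g2):
    """Two-component mixture density p*g1(x) + (1 - p)*g2(x).
    g1 and g2 are callables returning the component densities at x."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("the mixing proportion p must lie in [0, 1]")
    return p * g1(x) + (1.0 - p) * g2(x)
```

For example, `mixture_pdf(x, 0.25, lambda t: normal_pdf(t, 0.0, 1.0), lambda t: normal_pdf(t, 3.0, 1.0))` evaluates a mixture in which crop 1 occupies a quarter of the segment.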

Estimation of $\theta$ presents many challenging and, to date, unsolved
problems. During the course of this cooperative agreement progress was
achieved in the estimation of the mixing proportion $p$. The manuscript 3(a)
examines maximum likelihood and minimum distance estimation of the mixing
proportion $p$ when the two component distributions $g_j(x;\theta_j)$ are
represented by three-parameter Weibull distributions. The three-parameter
Weibull was used because of the variety of shapes it can assume for
different values of its parameters. In manuscript 3(a) the distance measure
used was the Cramer-von Mises distance $W^2$, whereas in manuscript 3(d)
(see also Appendix C) the Hellinger distance was used with normal component
distributions.


C. OTHER TOPICS 

Additional research which was completed during this cooperative
agreement is reported in Appendix A. Several papers were published (1(a),
1(e), 1(f), 2(b)) or submitted for publication (3(c)) on regression with
collinear variables. One article (1(c)) was published on the use of
quadratic forms in screening procedures.
II. PROSPECTIVE FUTURE RESEARCH

The research outlined in this report has not only led to the
publication of several scholarly articles, it has also identified topics
which offer great potential for further productive research efforts. Some
of the general areas of possible future research activities are now briefly
outlined.

Many important problems remain unresolved when estimating the
parameters of measurement error models. While there is theoretically no
problem with estimating the parameters of measurement error models when the
true predictor variable $X$ follows some nonnormal probability distribution,
little work has been conducted on the implementation of maximum likelihood
(or minimum distance) estimation in this setting. Likewise, no work has
been done on evaluating the effects of sample size on such estimation
procedures.

Another potential area of research is the extension of the results
reported above to regression models having more than one predictor
variable. Questions relating to the choice of error variance ratios and
the consequences of misspecifying these ratios require theoretical and
simulation investigations. Again, the difficulty with implementing
estimation procedures other than least squares when the predictor variables
are nonnormal must be addressed.

Associated with the estimation of the parameters of measurement error
models is the use of fitted models for prediction and calibration. Only a
few published articles have appeared on prediction and calibration with
measurement error models.


The other major topic which was investigated during the course of this
cooperative agreement, mixture model estimation, likewise poses many
problems for potentially fruitful future research. In the current research
only the estimation of the mixing proportion $p$ was studied. Much work
remains before acceptable estimation of all the model parameters can be
achieved. So too, estimation procedures for three or more component
distributions need to be developed.

A great deal of work remains to be done on the selection of component
distributions for use with the mixture model. Minimum distance estimation
does not necessitate that the component distributions be the "true" ones in
order to satisfactorily estimate crop proportions. During the
investigation of the Weibull components it was discovered that several
widely differing sets of parameters could yield mixture distributions which
were virtually identical. It would be extremely useful to identify
component families of distributions for which parameter estimation is
computationally efficient and which are "stable". The stability of
estimates would require that small perturbations of the data would not
result in widely differing parameter estimates and that radically different
parameter choices could not produce virtually identical mixtures.
III. ORAL PRESENTATIONS AND OTHER ACTIVITIES

Appendix B lists ten oral presentations of the research conducted
during this cooperative agreement. These oral presentations permitted
rapid dissemination of the major accomplishments of this research.
Presentations (8), (9), and (10) will be acknowledged as "Outstanding
Contributed Paper Presentations" by the Section on Physical and Engineering
Sciences of the American Statistical Association (ASA) during the 1986
Annual ASA Meetings next August.

Funding from this cooperative agreement was used for partial support
of the principal investigator during the summer months of 1984-1985. Also
partially supported during the summer months of 1985 was Professor Wayne A.
Woodward, who led the mixture model research. Three advanced graduate
students were also partially supported during this cooperative agreement:
Mr. Kelly Cunningham, Miss Miriam Reilman, and Dr. Mani Lakshminarayanan.
Dr. Lakshminarayanan completed his doctoral degree requirements while
supported by funds from this contract.
IV. APPENDICES


A. COMPLETED RESEARCH 


B. ORAL PRESENTATIONS 


C. ADDITIONAL WORK 


A. COMPLETED RESEARCH 


1. Publications in Refereed Journals 

(a) "Toward a Balanced Assessment of Collinearity Diagnostics,"
    The American Statistician, 38 (1984), 79-82.

(b) "Estimation of Parameters in Linear Structural Relationships:
    Sensitivity to the Choice of the Ratio of Error Variances,"
    Biometrika, 71 (1984), 569-573.

(c) "Screening Procedures Using Quadratic Forms," Communications
    in Statistics, A14 (1984), 1393-1404.

(d) "Structural Model Estimation with Correlated Measurement
    Errors," Biometrika, 72 (1985), to appear.

(e) "Outlier-Induced Collinearities," Technometrics, 27 (1985),
    to appear.

(f) "Selecting Principal Components in Regression," Statistics and
    Probability Letters, 3 (1985), 299-301.


2. Technical Reports and Proceedings Papers

(a) "Sensitivity of Errors-in-Variables Estimators to the
    Specification of the Ratio of Error Variances," Technical
    Report, NASA Johnson Space Center, Houston, TX (1983).

(b) "Regression Diagnostics and Approximate Inference Procedures
    for Penalized Least Squares Estimators," Department of Statistics
    Technical Report No. 181, SMU, Dallas, TX (1983).

(c) "Exploring the Use of Linear Structural Models to Improve
    Remote-Sensing Agricultural Estimates," Proceedings of the
    NASA Symposium on Mathematical Pattern Recognition and Image
    Analysis, NASA Johnson Space Center, Houston, TX (1984).




3. Research Articles Submitted for Publication 

(a) "Estimating Mixture Proportions for Component Weibull
    Distributions."

(b) "Stochastic Regression with Errors in Both Variables."

(c) "Diagnostics for Penalized Least Squares Estimators."

(d) "Minimum Hellinger Distance Estimation of Mixture Model
    Parameters: A Re-Examination."


B. ORAL PRESENTATIONS 


1. "Exploring the Use of Linear Structural Models to Improve
   Remote-Sensing Agricultural Estimates," NASA Symposium on Mathematical
   Pattern Recognition and Image Analysis, June 6-8, 1984, Johnson
   Space Center, Houston, TX.

2. "Effects of Misspecifying the Error Variance Ratio in Linear Structural
   Relationships," Joint Annual Meetings of the American Statistical
   Association and the Biometric Society, Philadelphia, PA, August 13-16,
   1984.

3. "Collinearity Assessment with Errors-in-Variables Models," Joint
   Annual Meetings of the American Statistical Association and the
   Biometric Society, Philadelphia, PA, August 13-16, 1984.

4. "Regression with Collinear Predictor Variables: Implications for Causal
   Inference," Department of Quantitative Business Analysis, Louisiana
   State University, October 18, 1984; Department of Mathematics,
   Northern Arizona University, November 9, 1984.

5. "Regression Estimation with Linear Structural Models," Division of
   Mathematical Sciences, University of Texas at Dallas, February 14,
   1985.

6. "Regression Models when Both Variables are Subject to Measurement
   Errors," Spring Regional Meeting of the Biometric Society (ENAR),
   Raleigh, NC, March 25-27, 1985.

7. "Measurement Error Models," Mathematics Department, General Motors
   Research Laboratories, July 22, 1985.

8. "Replication and Instrumental Variables Estimators for Linear Structural
   Models," Annual Meetings of the American Statistical Association and
   the Biometric Society, Las Vegas, NV, August 5-8, 1985.

9. "Regression Estimation with Controlled Observations," Annual Meetings
   of the American Statistical Association and the Biometric Society,
   Las Vegas, NV, August 5-8, 1985.

10. "Maximum Likelihood as Least Squares in Structural Model Estimation,"
    Annual Meetings of the American Statistical Association and the
    Biometric Society, Las Vegas, NV, August 5-8, 1985.


C. ADDITIONAL WORK


"Minimum Hellinger Distance Estimators
of Mixture Model Parameters:
A Re-Examination"


MINIMUM HELLINGER DISTANCE ESTIMATION
OF MIXTURE MODEL PARAMETERS:
A RE-EXAMINATION

Wayne A. Woodward 

1. Introduction

The problem of using minimum Hellinger distance estimation, suggested
by Beran (1977), for purposes of estimating mixture model parameters was
initially discussed by Woodward and Eslinger (1983). The use of the
minimum Hellinger distance estimator (MHDE) is intuitively appealing
because it is asymptotically efficient and asymptotically normal in
various settings (see Beran (1977), Stather (1981)), yet it has been shown
to be robust to departures from normality (see Eslinger (1984)). It was
believed that its use in the mixture-of-normals setting often assumed in
crop proportion estimation could provide efficient estimates when the
model is correct along with robustness to departures from normality.
Woodward et al. (1982, 1983, 1984) studied the use of minimum distance
estimation of the mixture model parameters using the Cramer-von Mises
distance and found that the minimum Cramer-von Mises distance estimator
(MCVMDE) provided results superior to the maximum likelihood estimator
(MLE) under departures from component normality but yielded decidedly
poorer results than the MLE when the assumption of component normality was
true.

Woodward and Eslinger (1983) showed empirically that the MHDE does
provide estimators comparable to the MLE under component normality which
at the same time show some robustness to departures from normality. The
MHDE was not, however, as robust as the MCVMDE, a result also shown by
Eslinger (1984) for the two-parameter normal. These results are to be
expected since in essence the MHDE is a compromise between the very robust
estimator which is not efficient at the true model and the efficient
estimator such as the MLE which is not robust. However, Woodward and
Eslinger (1983) encountered problems in implementing the MHDE in the
mixture setting. In particular, the implementation of the MHDE used by
those authors tended to be extremely sensitive to starting values, which
resulted in "failure to converge" problems an unacceptably high percentage
of the time. For mixtures with overlap, as defined by Woodward et al.
(1984), of .10 the MHDE converged on the average about 82% of the time, and
it converged 88% of the time for .03 overlap.

In this report we re-examine the use of the MHDE in the mixture of 
normals setting. We empirically examine the effect of the selection of 
the smoothing parameter h used in the kernel density estimation component 
of the MHDE, and we examine the use of an alternative maximization 
scheme. In this report as in the earlier reports by this author, we will 
be concerned only with the estimation of the mixture proportion. 


In Section 2 we provide a brief description of the MHDE in the
mixture-of-normals setting as implemented by Woodward and Eslinger (1983).
Section 3 provides details concerning new approaches designed to improve
the convergence of the MHDE. In Section 4 we give details of simulations
examining the new techniques.


2. The Minimum Hellinger Distance Estimator 
Let $\mathcal{F}_\Theta = \{F_\theta : \theta \in \Theta\}$ denote a family
of distributions which will be referred to as the projection family. These
distributions depend on the possibly vector-valued parameter $\theta$. We
will assume for our purposes here that the distributions in
$\mathcal{F}_\Theta$ are absolutely continuous. In our case we use the
mixture of normals projection model

    \[ f_\theta(x) = \frac{p}{\sqrt{2\pi}\,\sigma_1}\, e^{-\frac{1}{2}\left(\frac{x-\mu_1}{\sigma_1}\right)^2} + \frac{1-p}{\sqrt{2\pi}\,\sigma_2}\, e^{-\frac{1}{2}\left(\frac{x-\mu_2}{\sigma_2}\right)^2} \tag{2.1} \]

where the elements of $\theta = (\mu_1, \sigma_1, \mu_2, \sigma_2, p)'$ are
all assumed to be unknown. An MHDE of $\theta$ is a value $\hat\theta$
which minimizes a "distance" between the data distribution (whose model is
unknown) and the projection model. The distance measure used in MHD
estimation is the Hellinger distance. The Hellinger distance between two
absolutely continuous distributions is defined to be
$\|f^{1/2} - g^{1/2}\|$, where $f$ and $g$ denote the corresponding density
functions and $\|\cdot\|$ denotes the $L_2$ norm, i.e.

    \[ \|f^{1/2} - g^{1/2}\| = \left[\int \left(f^{1/2} - g^{1/2}\right)^2 dx\right]^{1/2} \tag{2.2} \]

where integration is with respect to Lebesgue measure on the real line.
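
As a rough numerical illustration of the Hellinger distance defined above, the sketch below approximates the integral with the trapezoidal rule over a finite interval and compares it with the known closed form for two normal densities. The function names, the integration limits, and the number of grid points are assumptions chosen for this example.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def hellinger_distance(f, g, a, b, k=2001):
    """Approximate ||f^(1/2) - g^(1/2)|| on [a, b] with a k-point trapezoidal rule."""
    dt = (b - a) / (k - 1)
    total = 0.0
    for i in range(k):
        t = a + i * dt
        weight = 0.5 if i in (0, k - 1) else 1.0
        total += weight * (math.sqrt(f(t)) - math.sqrt(g(t))) ** 2
    return math.sqrt(dt * total)


def hellinger_normal_closed_form(mu1, s1, mu2, s2):
    """Closed form for two normal densities, via the Bhattacharyya coefficient."""
    var_sum = s1 * s1 + s2 * s2
    bc = math.sqrt(2.0 * s1 * s2 / var_sum) * math.exp(-((mu1 - mu2) ** 2) / (4.0 * var_sum))
    return math.sqrt(2.0 * (1.0 - bc))
```

With this definition the distance ranges from 0 (identical densities) to the square root of 2 (densities with disjoint support), since no factor of one half is applied.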

The MHD estimator $\hat\theta_n$ of $\theta$ is defined as a value of
$\theta$ which minimizes $\|f_\theta^{1/2} - \hat g_n^{1/2}\|$, where
$\hat g_n$ is a nonparametric density estimator. We will employ a kernel
density estimator

    \[ \hat g_n(x) = \frac{1}{n h_n} \sum_{i=1}^{n} w\!\left(\frac{x - X_i}{h_n}\right) \]

based on the Epanechnikov kernel $w(x) = .75(1 - x^2)$ for $|x| \le 1$. It
should be noted that minimizing $\|f_\theta^{1/2} - \hat g_n^{1/2}\|$ is
equivalent to maximizing

    \[ \int f_\theta^{1/2}\, \hat g_n^{1/2}\, dx. \tag{2.3} \]
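
The kernel density estimator and the equivalence between minimizing the Hellinger distance and maximizing (2.3) can be sketched as follows. The equivalence rests on the identity that the squared distance equals the integral of $f$ plus the integral of $\hat g_n$ minus twice the integral in (2.3). All function names and the grid size are assumptions made for illustration.

```python
import math


def epanechnikov(u):
    """Epanechnikov kernel w(u) = .75(1 - u^2) on |u| <= 1, zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_density(x, data, h):
    """Kernel density estimate at x with bandwidth h."""
    return sum(epanechnikov((x - xi) / h) for xi in data) / (len(data) * h)


def trapezoid(fn, a, b, k=2001):
    """k-point trapezoidal rule for the integral of fn over [a, b]."""
    dt = (b - a) / (k - 1)
    total = 0.5 * (fn(a) + fn(b))
    for i in range(1, k - 1):
        total += fn(a + i * dt)
    return dt * total


def affinity(f, g, a, b, k=2001):
    """Approximate the integral of f^(1/2) g^(1/2), the quantity in (2.3)."""
    return trapezoid(lambda t: math.sqrt(f(t) * g(t)), a, b, k)


def squared_hellinger(f, g, a, b, k=2001):
    """Approximate the integral of (f^(1/2) - g^(1/2))^2."""
    return trapezoid(lambda t: (math.sqrt(f(t)) - math.sqrt(g(t))) ** 2, a, b, k)
```

Because both densities integrate to 1 over the whole line, a larger affinity corresponds exactly to a smaller squared distance, which is why either criterion yields the same estimate.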


In the earlier report Woodward and Eslinger (1983) examined the use of the
MHDE in this setting by maximizing (2.3) using Newton's method. This
recursive method involved the calculation of
$\partial f_\theta^{1/2}/\partial\theta$ and
$\partial^2 f_\theta^{1/2}/\partial\theta^2$, the forms of which are given
in the Appendix of that report.


3. New Implementations 

In the earlier report the smoothing parameter $h_n$ in the kernel
density estimator was taken to be $c_n s_0$, where $c_n = 2.16\,n^{-.271}$
and $s_0$ was set to the starting value estimate for the component with the
larger mixing proportion, i.e.

    \[ s_0^2 = \sigma_1^2(0) = \left[\left(\mu_1(0) - r_1^{(.25)}\right)/.6745\right]^2 \]

if $p \ge .5$ and

    \[ s_0^2 = \sigma_2^2(0) = \left[\left(r_2^{(.75)} - \mu_2(0)\right)/.6745\right]^2 \]

if $p < .5$, where $\mu_j(0)$ is the initial estimate of $\mu_j$ and
$r_j^{(q)}$ is the $q$th percentile from the $j$th cluster, $j = 1, 2$. We
denote this value of $h_n$ as $h_n^{(0)}$. This value of $h_n$ was chosen
since $h_n = c_n s_n$, with $s_n$ the median absolute deviation, had been
shown by Eslinger (1983) to be the optimal $h_n$ when using a two-parameter
normal projection model. In this report we examine the impact of two other
choices for $h_n$. The first modification to the earlier work was to take
$h_n = c_n s_m$, where $c_n$ is as before and $s_m$ is an initial estimate
of the mixture standard deviation given by

    \[ s_m = \left[p(0)\left(\sigma_1^2(0) + d_1^2\right) + (1 - p(0))\left(\sigma_2^2(0) + d_2^2\right)\right]^{1/2} \]

where $d_1 = \mu_1(0) - \mu_m(0)$, $d_2 = \mu_2(0) - \mu_m(0)$ with

    \[ \mu_m(0) = p(0)\,\mu_1(0) + (1 - p(0))\,\mu_2(0). \]

This resulted in larger values for $h_n$, especially in the cases in which
there is substantial separation between the two components.

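Parzen (1962) has found the $h_n$ which minimizes the integrated mean
square error between a kernel density estimator and the true density $f$.
His result shows that the $h_n$ optimal in this sense is
$h_n = \alpha(w)\beta(f)n^{-1/5}$, where

    \[ \alpha(w) = \frac{\left[\int w^2(y)\,dy\right]^{1/5}}{\left[\int w(y)\,y^2\,dy\right]^{2/5}} \quad\text{and}\quad \beta(f) = \left[\int \left(\frac{\partial^2 f(x)}{\partial x^2}\right)^2 dx\right]^{-1/5}. \tag{3.1} \]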
For the Epanechnikov kernel, $\alpha(w) = 1.71877$. In the case in which
$f$ is a mixture of normals as in (2.1),
$[\partial^2 f(x)/\partial x^2]^2$ is given by

    \[ \left[\frac{\partial^2 f(x)}{\partial x^2}\right]^2 = \frac{p^2}{2\sigma_1^5\sqrt{\pi}}\, n(\mu_1, \sigma_1/\sqrt{2})\,(z_1^2 - 1)^2 + \frac{(1-p)^2}{2\sigma_2^5\sqrt{\pi}}\, n(\mu_2, \sigma_2/\sqrt{2})\,(z_2^2 - 1)^2 + \frac{2p(1-p)}{2\pi\sigma_1^3\sigma_2^3}\, e^{-\frac{1}{2}\left(\frac{x-\mu_1}{\sigma_1}\right)^2} e^{-\frac{1}{2}\left(\frac{x-\mu_2}{\sigma_2}\right)^2} (z_1^2 - 1)(z_2^2 - 1) \]

where $z_1 = (x - \mu_1)/\sigma_1$, $z_2 = (x - \mu_2)/\sigma_2$, and
$n(\mu, \sigma)$ is the normal density with mean $\mu$ and standard
deviation $\sigma$. We examined the use of
$h_n^{(2)} = 1.71877\,\beta(f)\,n^{-1/5}$, where $\beta(f)$ was approximated
numerically using IMSL's numerical integration routine DCADRE.
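
The constants in (3.1) can be checked numerically. The sketch below recomputes $\alpha(w)$ for the Epanechnikov kernel and approximates $\beta(f)$ for a normal mixture with a simple trapezoidal rule standing in for DCADRE. The function names, the integration range of ten standard deviations, and the grid size are assumptions for illustration only.

```python
import math


def trapezoid(fn, a, b, k=4001):
    """k-point trapezoidal rule for the integral of fn over [a, b]."""
    dt = (b - a) / (k - 1)
    total = 0.5 * (fn(a) + fn(b))
    for i in range(1, k - 1):
        total += fn(a + i * dt)
    return dt * total


def epanechnikov(u):
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def alpha_w(kernel, lo=-1.0, hi=1.0):
    """alpha(w) = [int w^2]^(1/5) / [int w(y) y^2]^(2/5) for a kernel on [lo, hi]."""
    r = trapezoid(lambda y: kernel(y) ** 2, lo, hi)
    m2 = trapezoid(lambda y: kernel(y) * y * y, lo, hi)
    return r ** 0.2 / m2 ** 0.4


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_d2(x, p, mu1, s1, mu2, s2):
    """Second derivative in x of the two-component normal mixture density (2.1)."""
    z1 = (x - mu1) / s1
    z2 = (x - mu2) / s2
    return (p * normal_pdf(x, mu1, s1) * (z1 * z1 - 1.0) / s1 ** 2
            + (1.0 - p) * normal_pdf(x, mu2, s2) * (z2 * z2 - 1.0) / s2 ** 2)


def beta_f(p, mu1, s1, mu2, s2, width=10.0):
    """beta(f) = [int (f'')^2 dx]^(-1/5), integrated over a wide finite range."""
    lo = min(mu1 - width * s1, mu2 - width * s2)
    hi = max(mu1 + width * s1, mu2 + width * s2)
    roughness = trapezoid(lambda x: mixture_d2(x, p, mu1, s1, mu2, s2) ** 2, lo, hi)
    return roughness ** -0.2


def parzen_bandwidth(n, p, mu1, s1, mu2, s2):
    """h_n = alpha(w) * beta(f) * n^(-1/5) for the Epanechnikov kernel."""
    return alpha_w(epanechnikov) * beta_f(p, mu1, s1, mu2, s2) * n ** -0.2
```

In practice $f$ is unknown, so the parameter values passed to `beta_f` would have to be estimates such as starting values; the report does not state which values were used.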

As another strategy for improving convergence of the MHDE, an
alternative maximization technique was considered. Recall that the MHDE is
defined to be a value $\hat\theta$ which minimizes the integral

    \[ I = \int \left(f_\theta^{1/2} - \hat g_n^{1/2}\right)^2 dx. \]

This integral can be approximated using the trapezoidal rule by

    \[ \Delta t \sum_{i=1}^{k} a_i \left(f_\theta^{1/2}(t_i) - \hat g_n^{1/2}(t_i)\right)^2 \tag{3.2} \]

where $a_1 = a_k = 1/2$ and $a_i = 1$ for $i = 2, 3, \ldots, k-1$, for a
partition $t_1, t_2, \ldots, t_k$ of $[a, b]$, a finite interval, in our
case taken to be the support of $\hat g_n$, i.e., $a = Y_1 - h_n$ and
$b = Y_n + h_n$, where $Y_i$ denotes the $i$th order statistic. The
procedure employed was to minimize the sum of squares in (3.2) using IMSL
routine ZXSSQ, which employs the Marquardt-Levenberg algorithm (1963).
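
The trapezoidal objective (3.2) is simple to set up in code. The sketch below evaluates it for the mixing proportion $p$ alone, holding the component means and standard deviations fixed, and minimizes it by a plain grid search. This is a deliberately simplified stand-in: it is not the Marquardt-Levenberg routine ZXSSQ, it does not estimate all five parameters, and the names `make_objective` and `minimize_over_p` are hypothetical.

```python
import math


def epanechnikov(u):
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def make_objective(data, h, mu1, s1, mu2, s2, k=201):
    """Return a function of p that evaluates the trapezoidal sum (3.2)
    on [Y_1 - h, Y_n + h], the support of the kernel density estimate."""
    ys = sorted(data)
    a, b = ys[0] - h, ys[-1] + h
    dt = (b - a) / (k - 1)
    grid = [a + i * dt for i in range(k)]
    weights = [0.5 if i in (0, k - 1) else 1.0 for i in range(k)]
    n = len(ys)
    # Square root of the kernel density estimate, computed once per grid point.
    root_g = [math.sqrt(sum(epanechnikov((t - y) / h) for y in ys) / (n * h))
              for t in grid]
    comp1 = [normal_pdf(t, mu1, s1) for t in grid]
    comp2 = [normal_pdf(t, mu2, s2) for t in grid]

    def objective(p):
        total = 0.0
        for wgt, rg, c1, c2 in zip(weights, root_g, comp1, comp2):
            total += wgt * (math.sqrt(p * c1 + (1.0 - p) * c2) - rg) ** 2
        return dt * total

    return objective


def minimize_over_p(objective, steps=99):
    """Grid search over p in (0, 1); a crude substitute for a least squares solver."""
    candidates = [(i + 1) / (steps + 1) for i in range(steps)]
    return min(candidates, key=objective)
```

Precomputing the kernel estimate on the grid keeps each objective evaluation cheap, which matters because any iterative minimizer will call the objective many times.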

4. Simulation Results 

In this section we describe the results of simulations designed to
test the procedures described in the previous section. Both the Newton
and Marquardt-Levenberg algorithms were examined using $h_n^{(1)}$ and
$h_n^{(2)}$. Simulations involve mixtures of normal and of non-normal
components using the parameter configurations employed by Woodward and
Eslinger (1983). In particular, we use mixing proportions .25, .50, and .75
and "overlap" as previously defined by Woodward et al. (1984) of .03 and
.10. As in the
previous report we consider cases in which $\sigma_1/\sigma_2 = 1$ and $\sqrt{2}$. In the
present study we consider mixtures of normal and t(4) components, while
some results are given for mixtures of t(2) components. For each set of
configurations considered, 500 samples of size $n = 100$ were generated
from the corresponding mixture distribution. Simulations were performed on
the IBM 3081-D24 computer at Southern Methodist University. Starting values
were obtained as discussed by Woodward et al. (1984). For each sample
simulated, the MLE and MCVMDE were obtained along with several MHDE
estimators. MHD estimators employing Marquardt's procedure for minimizing
(3.2) will be denoted MHDEM(i), where $i = 1$ or 2 depending on whether
$h_n^{(1)}$ or $h_n^{(2)}$ was used in the density estimation. Likewise,
MHDEN(1) denotes the estimator using $h_n^{(1)}$ and employing Newton's
method for maximizing (2.3). The estimator MHDEN(2) was examined for
selected configurations and was seen to not perform as well as MHDEN(1).
Also shown are results labeled MHDEN'(1). These correspond to the MHDEN(1)
type estimates obtained by using starting values for $\sigma_1$ and
$\sigma_2$ which are smaller by a factor of 1.2 than those obtained by the
straightforward starting values given by Woodward et al. (1984). Woodward
and Eslinger (1983) showed that in the MHDE setting studied there, these
smaller starting values produced better results. For means of comparison
we have denoted by MHDE* in the tables the corresponding values obtained by
Woodward and Eslinger (1983). These estimators were not obtained on the
same sequence of samples as the current simulations. In Table 1 we present
results for simulated mixtures of normal components, in Table 2 we show the
corresponding results for simulated mixtures of t(4) components, while in
Table 4 we show a few results for simulated t(2) mixtures.
Simulation-based estimates of the bias and MSE associated with the various
estimators are given by

    \[ \text{Bias} = \frac{1}{n_s} \sum_{i=1}^{n_s} (\hat p_i - p) \]

    \[ \text{MSE} = \frac{1}{n_s} \sum_{i=1}^{n_s} (\hat p_i - p)^2 \]

where $n_s$ denotes the number of samples (500 in our case) and
$\hat p_i$ denotes an estimate of $p$ for the $i$th sample. As in the
earlier reports, nMSE is given in the tables, where $n$ is the size of each
individual sample (100 in our case). We provide empirical measures of the
relative efficiencies of the various estimators with respect to the MLE, by

    \[ E = \frac{\text{MSE(MLE)}}{\text{MSE(estimator)}}. \]

An approximate standard error of a tabled $n\widehat{\text{MSE}}$ is
$(.0632)(n\widehat{\text{MSE}})$. Also listed is the percentage of the 500
samples for which that estimator converged to "reasonable" values. For a
discussion of "reasonable" values see Woodward et al. (1984). If such
convergence was not obtained, for purposes of the simulation study the
estimate of $p$ was taken to be the starting value.
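
The summary statistics defined above are straightforward to compute from a list of simulated estimates. The following sketch uses hypothetical function names. The reading of the constant .0632 as the square root of 2/500, which is what a normal approximation to the estimation errors with negligible bias would give, is an assumption and is not stated in the report.

```python
import math


def simulation_summary(estimates, p_true, n):
    """Return (bias, n*MSE) for a list of simulated estimates of p,
    where n is the size of each individual sample."""
    n_s = len(estimates)
    bias = sum(e - p_true for e in estimates) / n_s
    mse = sum((e - p_true) ** 2 for e in estimates) / n_s
    return bias, n * mse


def relative_efficiency(mse_mle, mse_estimator):
    """E = MSE(MLE) / MSE(estimator); values above 1 favor the estimator."""
    return mse_mle / mse_estimator


def nmse_standard_error(nmse, n_s):
    """Approximate standard error of a tabled n*MSE, assuming the factor
    sqrt(2 / n_s) (an assumed reading of the report's .0632 for n_s = 500)."""
    return math.sqrt(2.0 / n_s) * nmse
```

Because the same factor $n$ multiplies both mean square errors, the relative efficiency can be computed from the tabled nMSE values directly.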

Examination of the tables shows that indeed the MHDE does appear to
behave as expected, i.e. it provides fairly efficient estimators under
component normality (as evidenced by E values near 1) along with estimates
more robust than the MLE for simulations of mixtures of non-normal
components (as evidenced by E > 1).

The percentage converging information is summarized in Table 3 where 
the value tabled for a given overlap is the average percent convergence 
obtained for the 10 configurations of normal and t(4) components in Tables 
1 and 2 for that overlap. All of the techniques proposed in this report 
produced estimators with higher rates of convergence than the estimators 
in the earlier study, especially for the .03 overlap. It is very clear 
from the table that the Marquardt based estimators are far superior to 
those using Newton's method in terms of percentage convergence obtained. 
It should be noted that convergence was almost always obtained in these 
settings by the MLE and MCVMDE. 
Concerning MSE performance, initial examination shows that MHDEN(1)
provides smaller MSE's than the other MHDE techniques for .10 overlap, yet
MHDEM(2) seems to provide the best estimates at the .03 overlap. However,
in the .10 overlap case the starting values tend to outperform all other
estimators, and about 16% of the "MHDEN(1)" results are actually starting
values; this has a tendency to deflate the nMSE for MHDEN(1). As in the
earlier study, the MHDEN'(1) estimates (using the scaled starting values)
seem to perform better than MHDEN(1).

It should be pointed out that about twice as much time is required to
produce MHDEM(i) estimates as the MHDEN(i) estimates. It was mentioned
by Woodward and Eslinger (1983) that the time required for the MHDEN was
comparable to that for the Cramer-von Mises estimator.

In Table 4 we show results for simulated t(2) components for
configurations with $\sigma_1/\sigma_2 = 1$. It is seen that again, in this
extremely heavy-tailed departure from component normality, the MHDE
produced results markedly better than the MLE yet usually not as robust as
the MCVMDE. Convergence in this setting was sometimes a problem for the
MLE procedure, which used the EM algorithm. However, the MHDEM(2)
estimates converged at an extremely high rate.

5. Concluding Remarks 

As a result of the present study it seems that the MHDE could indeed
be a useful estimator of the mixing proportion of a two-component
mixture. Using the Marquardt procedure, the convergence problems no
longer exist, although computer time is doubled. Work should be done to
determine whether the time required to calculate the MHDEM(1) estimates
can be decreased.

Parameter estimates using the MHDE conform to predictable patterns,
i.e. the estimates are more efficient than the Cramer-von Mises estimates
under component normality, yet are not as robust as the Cramer-von Mises
results. Because of the increased percentage of convergence, the results
obtained here provide a clearer picture of actual estimator performance
than the results given earlier by Woodward and Eslinger (1983).

Whether the MHDE could be successfully used as an alternative to
maximum likelihood estimation of a crop proportion based on remote sensing
data remains to be determined. However, the results shown here imply that
its successful application is a possibility.
Table 1. Simulation Results for Mixtures of Normal Components

Table 2. Simulation Results for Mixtures of t(4) Components

Table 3. Percentage Convergence Obtained
         for Normal and t(4) Components

                   Overlap
Estimator       .10       .03

MHDEN(1)       82.4      95.8
MHDEN'(1)      85.6      96.8
MHDEM(1)       99.5     100.0
MHDEM(2)       99.7     100.0
MHDE*          82.1      88.1

Sample Size = 100
Number of Replications = 500